13 research outputs found

A view of Estimation of Distribution Algorithms through the lens of Expectation-Maximization

Author: Brookes David H.
Busia Akosua
Fannjiang Clara
Listgarten Jennifer
Murphy Kevin
Publication date: 11/06/2020

We show that a large class of Estimation of Distribution Algorithms, including, but not limited to, Covariance Matrix Adaptation, can be written as a Monte Carlo Expectation-Maximization algorithm, and as exact EM in the limit of infinite samples. Because EM sits on a rigorous statistical foundation and has been thoroughly analyzed, this connection provides a new coherent framework with which to reason about EDAs.

arXiv.org e-Print Archive

Prediction-Powered Inference

Author: Angelopoulos Anastasios N.
Bates Stephen
Fannjiang Clara
Jordan Michael I.
Zrnic Tijana
Publication date: 16/02/2023

We introduce prediction-powered inference, a framework for performing valid statistical inference when an experimental data set is supplemented with predictions from a machine-learning system. Our framework yields provably valid conclusions without making any assumptions on the machine-learning algorithm that supplies the predictions. Higher accuracy of the predictions translates to smaller confidence intervals, permitting more powerful inference. Prediction-powered inference yields simple algorithms for computing valid confidence intervals for statistical objects such as means, quantiles, and linear and logistic regression coefficients. We demonstrate the benefits of prediction-powered inference with data sets from proteomics, genomics, electronic voting, remote sensing, census analysis, and ecology. Comment: Code is available at https://github.com/aangelopoulos/prediction-powered-inferenc

arXiv.org e-Print Archive

Augmenting biologging with supervised machine learning to study in situ behavior of the medusa Chrysaora fuscescens

Author: Cones Seth
Fannjiang Clara
Katija Kakani
Mann David
Mooney T. Aran
Shorter K. Alex
Publication venue: 'The Company of Biologists'
Publication date: 23/08/2019

© The Author(s), 2019. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in Fannjiang, C., Mooney, T. A., Cones, S., Mann, D., Shorter, K. A., & Katija, K. Augmenting biologging with supervised machine learning to study in situ behavior of the medusa Chrysaora fuscescens. Journal of Experimental Biology, 222, (2019): jeb.207654, doi:10.1242/jeb.207654. Zooplankton play critical roles in marine ecosystems, yet their fine-scale behavior remains poorly understood because of the difficulty in studying individuals in situ. Here, we combine biologging with supervised machine learning (ML) to propose a pipeline for studying in situ behavior of larger zooplankton such as jellyfish. We deployed the ITAG, a biologging package with high-resolution motion sensors designed for soft-bodied invertebrates, on eight Chrysaora fuscescens in Monterey Bay, using the tether method for retrieval. By analyzing simultaneous video footage of the tagged jellyfish, we developed ML methods to: (1) identify periods of tag data corrupted by the tether method, which may have compromised prior research findings, and (2) classify jellyfish behaviors. Our tools yield characterizations of fine-scale jellyfish activity and orientation over long durations, and we conclude that it is essential to develop behavioral classifiers on in situ rather than laboratory data. This work was supported by the David and Lucile Packard Foundation (to K.K.), the Woods Hole Oceanographic Institution (WHOI) Green Innovation Award (to T.A.M., K.K. and K.A.S.) and National Science Foundation (NSF) DBI collaborative awards (1455593 to T.A.M. and K.A.S.; 1455501 to K.K.). Deposited in PMC for immediate release.

Woods Hole Open Access Server

Optimal arrays for compressed sensing in snapshot-mode radio interferometry

Author: Clara Fannjiang
Publication venue: 'EDP Sciences'
Publication date: 18/11/2013

Context. Radio interferometry has always faced the problem of incomplete sampling of the Fourier plane. A possible remedy can be found in the promising new theory of compressed sensing (CS), which allows for the accurate recovery of sparse signals from sub-Nyquist sampling given certain measurement conditions. Aims. We provide an introductory assessment of optimal arrays for CS in snapshot-mode radio interferometry, using orthogonal matching pursuit (OMP), a widely used CS recovery algorithm similar in some respects to CLEAN. We focus on comparing centrally condensed (specifically, Gaussian) arrays to uniform arrays, and randomized arrays to deterministic arrays such as the VLA. Methods. The theory of CS is grounded in a) sparse representation of signals and b) measurement matrices of low coherence. We calculate the mutual coherence of measurement matrices as a theoretical indicator of arrays’ suitability for OMP, based on the recovery error bounds in Donoho et al. (2006, IEEE Trans. Inform. Theory, 52, 1289). OMP reconstructions of both point and extended objects are also run from simulated incomplete data. Optimal arrays are considered for objects represented in 1) the natural pixel basis and 2) the block discrete cosine transform (BDCT). Results. We find that reconstructions of the pixel representation perform best with the uniform random array, while reconstructions of the BDCT representation perform best with normal random arrays. Slight randomization to the VLA also improves it dramatically for CS recovery with the pixel basis. Conclusions. In the pixel basis, array design for CS reflects known principles of array design for small numbers of antennas, namely of randomness and uniform distribution. Differing results with the BDCT, however, emphasize the need to study how sparsifying bases affect array design before CS can be optimized for radio interferometry

EDP Sciences OAI-PMH repository (1.2.0)

Toward Trustworthy Scientific Inquiry and Design with Machine Learning

Author: Wong-Fannjiang Clara
Publication venue: eScholarship, University of California
Publication date: 01/01/2023

The last decade has witnessed rapid development and deployment of machine-learning systems across science. Such systems can supply predictions about scientific phenomena far more quickly and cheaply than gold-standard experiments, and are being used in efforts to both discover scientific knowledge and design new biomolecules. However, an important question remains unanswered: since machine-learning systems make errors, how can we use them in a trustworthy way for scientific discovery and design? This dissertation takes steps toward helping to ensure that the biomolecules we design and the scientific conclusions we draw using machine learning can be trusted. We begin in the setting of machine learning-based design. The goal in this setting is to propose novel objects such as proteins, small molecules, or materials with desired properties, in a way that is guided by machine-learning models of such properties. Toward addressing model trustworthiness for design, we propose (i) a method for learning models that accounts for the distribution shifts inherent to design, and (ii) a method for constructing statistically valid confidence sets for the properties of objects designed using machine learning. Finally, we examine the trustworthy use of machine learning for drawing scientific conclusions. In particular, we consider the increasingly relevant setting of treating predictions made by machine-learning systems as “data” in estimating quantities of scientific interest. We propose prediction-powered inference, a novel statistical framework for constructing valid confidence sets in this setting, which enables researchers to incorporate evidence from machine-learning systems into their scientific inquiry in a standardized and principled way.

eScholarship - University of California

Better images, fewer samples: Optimizing array configuration for compressed sensing in radio interferometry

Author: Clara Fannjiang
Högbom J. A.
Thompson A. R.
Tibshirani R.
Publication venue: 'Society of Exploration Geophysicists'

Crossref

Optimal arrays for compressed sensing in snapshot-mode radio interferometry

Author: Boone
Candès
Candès
Candès
Candès
Chen
Clara Fannjiang
Clark
Davis
Davis
Dollet
Donoho
Donoho
Fannjiang
Geršgorin
Högbom
Lannes
Li
Lo
McEwen
Polygiannakis
Starck
Starck
Tibshirani
Tropp
Wiaux
Publication venue: 'EDP Sciences'

Crossref

Area-only method for underwater object tracking using autonomous vehicles

Author: Aguzzi Jacopo
Bouvet P.J.
Fannjiang Clara
Gomáriz Spartacus
Katija Kakani
Kieft Brian
Masmitja I.
O'Reilly Tom
Río Joaquín del
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 18/03/2020

OCEANS 2019, 17-20 June 2019, Marseille. 9 pages, 6 figures, 1 table. The use of autonomous underwater vehicles for ocean research has increased as they have a better cost/performance ratio than crewed oceanographic vessels. For example, autonomous vehicles (e.g. a Wave Glider) can be used to localise and track underwater targets. Whereas other researchers have focused on target tracking using acoustic modems, here we present a novel method called area-only target tracking. This method works with commercially available acoustic tags, thereby reducing the cost and complexity compared with other tracking systems. Moreover, because the tags are small, this method can be used to track small targets such as jellyfish. The methodology behind the area-only technique is shown, and results from the first field tests conducted in the Monterey Bay area are also presented.

Digital.CSIC